Olaf Hering [Sun, 20 Nov 2011 16:02:36 +0000 (17:02 +0100)]
xenpaging: remove xc_dominfo_t from paging_t
Remove xc_dominfo_t from paging_t, record only max_pages.
This value is used to setup internal data structures.
Signed-off-by: Olaf Hering <olaf@aepfle.de>
Committed-by: Ian Jackson <ian.jackson.citrix.com>
Olaf Hering [Sun, 20 Nov 2011 16:02:22 +0000 (17:02 +0100)]
xenpaging: update xenpaging_init
Move comment about xc_handle to the right place.
Allocate paging early and use calloc.
Signed-off-by: Olaf Hering <olaf@aepfle.de>
Committed-by: Ian Jackson <ian.jackson.citrix.com>
Olaf Hering [Sun, 20 Nov 2011 16:01:41 +0000 (17:01 +0100)]
xenpaging: print gfn in failure case
Signed-off-by: Olaf Hering <olaf@aepfle.de>
Committed-by: Ian Jackson <ian.jackson.citrix.com>
Olaf Hering [Sun, 20 Nov 2011 16:01:39 +0000 (17:01 +0100)]
xenpaging: simplify file_op
Catch lseek() errors.
Use -1 as return value and let caller read errno.
Remove const casts from buffer pointers, the page is writeable.
Use wrapper for write() which matches the read() prototype.
Remove unused stdarg.h inclusion.
Remove unused macro.
Signed-off-by: Olaf Hering <olaf@aepfle.de>
Committed-by: Ian Jackson <ian.jackson.citrix.com>
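The file_op changes listed above can be sketched roughly as follows. This is a minimal illustration, not the xenpaging source: the names file_op, write_wrapper, read_page and write_page, and the EIO choice for a zero-length transfer, are assumptions made for this example.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical wrapper: write() takes a const buffer, read() does not.
 * Giving both the same prototype lets one helper drive either call. */
static ssize_t write_wrapper(int fd, void *buf, size_t count)
{
    return write(fd, buf, count);
}

/* Transfer exactly 'count' bytes at 'offset'. Returns 0 on success and
 * -1 on failure, leaving errno for the caller to inspect. */
static int file_op(int fd, void *buf, off_t offset, size_t count,
                   ssize_t (*fn)(int, void *, size_t))
{
    char *p = buf;
    size_t done = 0;

    if (lseek(fd, offset, SEEK_SET) == (off_t)-1)
        return -1;                  /* errno was set by lseek() */

    while (done < count) {
        ssize_t n = fn(fd, p + done, count - done);
        if (n < 0) {
            if (errno == EINTR)
                continue;           /* interrupted, retry */
            return -1;
        }
        if (n == 0) {
            errno = EIO;            /* assumption: treat a zero-length transfer as an error */
            return -1;
        }
        done += (size_t)n;
    }
    return 0;
}

int read_page(int fd, void *page, off_t offset, size_t size)
{
    return file_op(fd, page, offset, size, read);
}

int write_page(int fd, void *page, off_t offset, size_t size)
{
    return file_op(fd, page, offset, size, write_wrapper);
}
```

Because the helper only ever returns 0 or -1, a caller can report the failure with a perror-style macro and does not need to decode a mixed return value.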
Olaf Hering [Sun, 20 Nov 2011 16:01:32 +0000 (17:01 +0100)]
xenpaging: use PERROR to print errno
v3:
- adjust arguments for xc_mem_paging_enable() failures
v2:
- move changes to file_op() to different patch
Signed-off-by: Olaf Hering <olaf@aepfle.de>
Committed-by: Ian Jackson <ian.jackson.citrix.com>
Olaf Hering [Sun, 20 Nov 2011 16:01:20 +0000 (17:01 +0100)]
xenpaging: remove obsolete comment in resume path
Remove stale comment.
If a page was populated several times the vcpu is paused and
xenpaging has to unpause it again.
Signed-off-by: Olaf Hering <olaf@aepfle.de>
Committed-by: Ian Jackson <ian.jackson.citrix.com>
Olaf Hering [Sun, 20 Nov 2011 16:01:15 +0000 (17:01 +0100)]
xenpaging: remove filename from comment
Signed-off-by: Olaf Hering <olaf@aepfle.de>
Committed-by: Ian Jackson <ian.jackson.citrix.com>
Anil Madhavapeddy [Thu, 24 Nov 2011 19:09:55 +0000 (19:09 +0000)]
libvchan: fix segfault in client error path
In libvchan_client_init, go to the error path if the gntdev device is
not available. Otherwise, a segfault happens later as the vchan
context is invalid.
Signed-off-by: Anil Madhavapeddy <anil@recoil.org>
Committed-by: Ian Jackson <ian.jackson@eu.citrix.com>
Ian Jackson [Thu, 24 Nov 2011 19:00:25 +0000 (19:00 +0000)]
tools/check: Add files missing from 24205:5c88358164cc
Committed-by: Ian Jackson <ian.jackson@eu.citrix.com>
Ian Campbell [Thu, 24 Nov 2011 17:43:36 +0000 (17:43 +0000)]
tools: check for libaio unless user has configured CONFIG_SYSTEM_LIBAIO=n
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Committed-by: Ian Jackson <ian.jackson@eu.citrix.com>
Andres Lagar-Cavilla [Thu, 24 Nov 2011 17:05:25 +0000 (17:05 +0000)]
x86/mm: Fix liveness of pages in grant copy operations
We were immediately putting the p2m entry translation for grant
copy operations. This allowed for an unnecessary race by which the
page could have been swapped out between the p2m lookup and the actual
use. Hold on to the p2m entries until the grant operation finishes.
Also fixes a small bug: for the source page of the copy, get_page
was assuming the page was owned by the source domain. It may be a
shared page, since we don't perform an unsharing p2m lookup.
Signed-off-by: Andres Lagar-Cavilla <andres@lagarcavilla.org>
Acked-by: Tim Deegan <tim@xen.org>
Committed-by: Tim Deegan <tim@xen.org>
Andres Lagar-Cavilla [Thu, 24 Nov 2011 17:05:25 +0000 (17:05 +0000)]
x86/mm: Ensure liveness of pages involved in a guest page table walk
Instead of risking deadlock by holding onto the gfn's acquired during
a guest page table walk, acquire an extra reference within the get_gfn/
put_gfn critical section, and drop the extra reference when done with
the map. This ensures liveness of the map, i.e. the underlying page
won't be paged out.
Signed-off-by: Andres Lagar-Cavilla <andres@lagarcavilla.org>
Acked-by: Tim Deegan <tim@xen.org>
Committed-by: Tim Deegan <tim@xen.org>
Ian Campbell [Thu, 24 Nov 2011 17:00:33 +0000 (17:00 +0000)]
libvchan: clean *.opic
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Committed-by: Ian Jackson <ian.jackson@eu.citrix.com>
Jan Beulich [Thu, 24 Nov 2011 16:56:26 +0000 (17:56 +0100)]
x86: small fixes to pcpu platform op handling
XENPF_get_cpuinfo should init the flags output field rather than only
modify it.
XENPF_cpu_online must check for the input CPU number to be in range.
XENPF_cpu_offline must also do that, and should also reject attempts to
offline CPU 0 (this fails in cpu_down() too, but preventing this here
appears more correct given that the code here calls
continue_hypercall_on_cpu(0, ...), which would be flawed if cpu_down()
would ever allow bringing down CPU 0; a distinct error code is also
easier to deal with when debugging issues).
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
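The input checks described for XENPF_cpu_offline could look something like the sketch below. It is illustrative only: NR_CPUS is given an arbitrary value, check_cpu_offline is a hypothetical helper, and the specific error codes are assumptions rather than the ones the hypervisor returns.

```c
#include <errno.h>

#define NR_CPUS 8   /* hypothetical limit for this sketch */

/* Validate the CPU number of an offline request.
 * Returns 0 if the request may proceed, or a negative errno value. */
int check_cpu_offline(unsigned int cpu)
{
    if (cpu >= NR_CPUS)
        return -EINVAL;     /* CPU number out of range */
    if (cpu == 0)
        return -EPERM;      /* assumed distinct code: CPU 0 is never offlined */
    return 0;
}
```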
Andres Lagar-Cavilla [Thu, 24 Nov 2011 16:34:34 +0000 (16:34 +0000)]
x86/mm: ASSERT we are putting the right gfn in the XENMAPSPACE_gmfn* cases
Signed-off-by: Andres Lagar-Cavilla <andres@lagarcavilla.org>
Acked-by: Tim Deegan <tim@xen.org>
Committed-by: Tim Deegan <tim@xen.org>
Andres Lagar-Cavilla [Thu, 24 Nov 2011 16:34:34 +0000 (16:34 +0000)]
x86/mm: handle HVMOP_modified_memory on shared pages
Signed-off-by: Andres Lagar-Cavilla <andres@lagarcavilla.org>
Signed-off-by: Adin Scannell <adin@scannell.ca>
Acked-by: Tim Deegan <tim@xen.org>
Committed-by: Tim Deegan <tim@xen.org>
Andres Lagar-Cavilla [Thu, 24 Nov 2011 16:34:34 +0000 (16:34 +0000)]
x86/mm: fix domain-paging's interaction with log-dirty
Allow pages typed log dirty to be paged out, and the proper type to
be restored when paging pages back in.
Signed-off-by: Andres Lagar-Cavilla <andres@lagarcavilla.org>
Signed-off-by: Adin Scannell <adin@scannell.ca>
Acked-by: Tim Deegan <tim@xen.org>
Committed-by: Tim Deegan <tim@xen.org>
Keir Fraser [Thu, 24 Nov 2011 15:50:08 +0000 (15:50 +0000)]
x86/waitqueue: Because we have per-cpu stacks, we must wake up on the
same cpu that we slept on. Otherwise stack references are bogus on
wakeup.
Signed-off-by: Keir Fraser <keir@xen.org>
Keir Fraser [Thu, 24 Nov 2011 15:49:25 +0000 (15:49 +0000)]
waitqueue: Reorder prepare_to_wait() so that vcpu is definitely on the
queue on exit, even after a wakeup.
Otherwise, when we go round the loop in wait_event(), we may not
actually sleep after the first iteration, as we do not put ourselves
back on the queue on wakeup.
Signed-off-by: Keir Fraser <keir@xen.org>
Keir Fraser [Thu, 24 Nov 2011 15:48:10 +0000 (15:48 +0000)]
waitqueue: Detect saved-stack overflow and crash the guest.
Signed-off-by: Keir Fraser <keir@xen.org>
Andres Lagar-Cavilla [Thu, 24 Nov 2011 15:45:19 +0000 (15:45 +0000)]
Properly compare "pci" token when grokking serial port config
Signed-off-by: Andres Lagar-Cavilla <andres@lagarcavilla.org>
Committed-by: Keir Fraser <keir@xen.org>
Andres Lagar-Cavilla [Thu, 24 Nov 2011 15:44:51 +0000 (15:44 +0000)]
Trivial fix for rc val in hap track dirty vram
Signed-off-by: Andres Lagar-Cavilla <andres@lagarcavilla.org>
Committed-by: Keir Fraser <keir@xen.org>
Jean Guyader [Thu, 24 Nov 2011 15:43:59 +0000 (15:43 +0000)]
hvmloader: Intel GPU passthrough, reserve OpRegion
The Intel GPU uses a two-page NVS region called OpRegion.
In order to get full support for the driver in the guest
we need to map this region.
This patch reserves 2 pages on the top of the memory in the
reserved area and marks this region as NVS in the e820. Then
we write the address to the config space (offset 0xfc) so the
device model can map the OpRegion at this address in the guest.
Signed-off-by: Jean Guyader <jean.guyader@eu.citrix.com>
Committed-by: Keir Fraser <keir@xen.org>
Andres Lagar-Cavilla [Thu, 24 Nov 2011 15:20:57 +0000 (15:20 +0000)]
x86/mm/p2m: don't overwrite m2p entry of still-shared pages
When updating a p2m mapping to shared, previous code
unconditionally set the m2p entry for the old mfn to invalid.
We now check that the old mfn does not remain shared.
Signed-off-by: Andres Lagar-Cavilla <andres@lagarcavilla.org>
Acked-by: Tim Deegan <tim@xen.org>
Committed-by: Tim Deegan <tim@xen.org>
Andres Lagar-Cavilla [Thu, 24 Nov 2011 15:20:57 +0000 (15:20 +0000)]
x86/mm: change return code for log-dirty disabling
Disabling log dirty mode in HAP always returns -EINVAL. Make it
return the correct rc on success.
Signed-off-by: Andres Lagar-Cavilla <andres@lagarcavilla.org>
Signed-off-by: Tim Deegan <tim@xen.org>
Committed-by: Tim Deegan <tim@xen.org>
Andres Lagar-Cavilla [Thu, 24 Nov 2011 15:20:57 +0000 (15:20 +0000)]
x86/mm/p2m: fix pod locking
The path p2m-lookup -> p2m-pt->get_entry -> 1GB PoD superpage ->
pod_demand_populate ends in the pod code performing a p2m_set_entry with
no locks held (in order to split the 1GB superpage into 512 2MB ones).
Further, it calls p2m_unlock after that, which will break the spinlock.
This patch attempts to fix that.
Signed-off-by: Andres Lagar-Cavilla <andres@lagarcavilla.org>
Acked-by: George Dunlap <george.dunlap@eu.citrix.com>
Acked-by: Tim Deegan <tim@xen.org>
Committed-by: Tim Deegan <tim@xen.org>
Jan Beulich [Thu, 24 Nov 2011 08:44:54 +0000 (09:44 +0100)]
ia64: build fixes (again)
This undoes a single change from c/s 24136:3622d7fae14d
(common/grant_table.c) and several from c/s 24100:be8daf78856a
(common/memory.c). It also completes the former with two previously
missing ia64 specific code adjustments. Authors Cc-ed.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andres Lagar-Cavilla <andres@lagarcavilla.org>
Paul Durrant [Wed, 23 Nov 2011 12:03:37 +0000 (12:03 +0000)]
libxl: Prevent xl save from segfaulting when control/shutdown key is removed
To acknowledge the tools' setting of control/shutdown it is normal for
PV drivers to rm the key. This leads to libxl__xs_read() returning
NULL and thus a subsequent strcmp on the return value will cause a
segfault.
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Acked-by: Ian Jackson <ian.jackson@eu.citrix.com>
Committed-by: Ian Jackson <ian.jackson@eu.citrix.com>
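The failure mode above comes down to passing a NULL pointer to strcmp(). A minimal sketch of the guard follows, using a hypothetical helper key_matches rather than the libxl code itself:

```c
#include <stddef.h>
#include <string.h>

/* Returns 1 if the value read from the store equals 'expected'.
 * 'value' may be NULL when the key has been removed, so it is checked
 * before being handed to strcmp(). */
int key_matches(const char *value, const char *expected)
{
    return value != NULL && strcmp(value, expected) == 0;
}
```

With this shape a removed key simply compares as "not equal", which is the natural reading of a guest having acknowledged the request.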
Ian Campbell [Wed, 23 Nov 2011 11:15:31 +0000 (11:15 +0000)]
tools: use system libaio for blktap1 as well.
24184:4ecd3615e726 missed this because I was accidentally testing with a
.config containing CONFIG_SYSTEM_LIBAIO=n. Tools tree now fully rebuilt
without this. There were no other issues.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Committed-by: Ian Jackson <ian.jackson@eu.citrix.com>
Ian Campbell [Tue, 22 Nov 2011 18:51:13 +0000 (18:51 +0000)]
docs: remove some fatally out of date documentation
I think these are better off deleted than remaining to confuse people.
docs/misc/blkif-drivers-explained.txt:
- Talks about Xen 2.0 beta, talks about the old pre-xenstored IPC mechanism.
docs/misc/cpuid-config-for-guest.txt:
- Doesn't really say anything, in particular doesn't actually describe how to
configure CPUID.
docs/misc/hg-cheatsheet.txt:
- Not out of date per se wrt mercurial but talks about Xen 2.0 and gives URLs
under www.cl.cam.ac.uk. Talks a lot about bitkeeper. Given that mercurial is
hardly unusual anymore I think there must be better guides out there so this
one is not worth resurrecting.
docs/misc/network_setup.txt:
- This is more comprehensively documented on the wiki these days.
docs/misc/VMX_changes.txt:
- Is basically a changelog from the initial implementation of VMX in 2004.
I'm not sure about some of the other docs, but these ones seemed fairly
obvious.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Committed-by: Ian Jackson <ian.jackson@eu.citrix.com>
Ian Campbell [Tue, 22 Nov 2011 17:24:51 +0000 (17:24 +0000)]
tools: use system installed libaio by default.
I could have sworn I did this years ago.
IIRC the need for our own copy was due to the use of io_set_eventfd which is
not present in version 0.3.106. However it is in 0.3.107 the first version of
which was uploaded to Debian in June 2008 (I can't find a better reference for
the release date).
The necessary version is available in Debian Lenny onwards and is in at least
RHEL 6, Fedora 13 and OpenSuSE 11.3. The necessary version appears to not be
available in RHEL 5 or SLES 11 which is why I haven't simply nuked the in tree
version.
This is based on tools-system-libaio.diff from the Debian packaging although I
have made it optional (but default on).
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Committed-by: Ian Jackson <ian.jackson@eu.citrix.com>
Ian Jackson [Tue, 22 Nov 2011 17:09:12 +0000 (17:09 +0000)]
Merge
Daniel De Graaf [Tue, 22 Nov 2011 17:07:32 +0000 (17:07 +0000)]
xenstore: xenbus cannot be opened read-only
In order to read keys from xenstore, the xenstore libraries need to
write the request to the xenbus socket. This means that the socket
cannot be opened read-only.
Signed-off-by: Daniel De Graaf <dgdegra@tycho.nsa.gov>
Committed-by: Ian Jackson <ian.jackson@eu.citrix.com>
Jan Beulich [Tue, 22 Nov 2011 16:22:31 +0000 (17:22 +0100)]
move pci_find_ext_capability() into common PCI code
There's nothing architecture specific about it. It requires, however,
that x86-32's pci_conf_read32() tolerates register accesses above 255
(for consistency the adjustment is done to all pci_conf_readNN()
functions).
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
Stefano Stabellini [Tue, 22 Nov 2011 16:19:48 +0000 (16:19 +0000)]
libxl: Add a vkbd frontend/backend pair for HVM guests
Linux PV on HVM guests can use vkbd, so add a vkbd frontend/backend
pair for HVM guests by default. It is useful because it doesn't
require frequent qemu wakeups as the usb keyboard/mouse does.
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Committed-by: Ian Jackson <ian.jackson@eu.citrix.com>
Ian Jackson [Tue, 22 Nov 2011 16:19:11 +0000 (16:19 +0000)]
Update QEMU_TAG
Keir Fraser [Tue, 22 Nov 2011 15:35:26 +0000 (15:35 +0000)]
debug: Add domain/vcpu pause_count info to 'd' key.
Signed-off-by: Keir Fraser <keir@xen.org>
Daniel De Graaf [Tue, 22 Nov 2011 13:29:48 +0000 (13:29 +0000)]
xsm/flask: fix resource list range checks
The FLASK security checks for resource ranges were not implemented
correctly - only the permissions on the endpoints of a range were
checked, instead of all items contained in the range. This would allow
certain resources (I/O ports, I/O memory) to be used by domains in
contravention to security policy.
This also corrects a bug where adding overlapping resource ranges did
not trigger an error.
Signed-off-by: Daniel De Graaf <dgdegra@tycho.nsa.gov>
Committed-by: Keir Fraser <keir@xen.org>
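The difference between checking only the endpoints and checking the whole range can be shown with a simplified sketch. The per-item policy below (item_allowed, denying 0x60 to 0x6f) is invented for illustration, and walking the range item by item is a simplification, not how the FLASK code is structured.

```c
#include <stdbool.h>

/* Hypothetical per-item policy: in this sketch, items 0x60-0x6f are denied. */
static bool item_allowed(unsigned long item)
{
    return item < 0x60 || item > 0x6f;
}

/* Flawed variant: only looks at the two endpoints of the range. */
bool range_allowed_endpoints(unsigned long start, unsigned long end)
{
    return item_allowed(start) && item_allowed(end);
}

/* Checks every item in the inclusive range [start, end]. */
bool range_allowed_all(unsigned long start, unsigned long end)
{
    if (start > end)
        return false;               /* malformed range */

    for (unsigned long i = start; ; i++) {
        if (!item_allowed(i))
            return false;
        if (i == end)               /* test before incrementing to avoid wrap-around */
            break;
    }
    return true;
}
```

For the range 0x50 to 0x80 the endpoint-only variant grants access even though the denied items lie inside the range, while the full check refuses it.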
Daniel De Graaf [Tue, 22 Nov 2011 13:29:01 +0000 (13:29 +0000)]
xsm/flask: Use correct flag to detect writable grant mappings
The flags passed to xsm_grant_mapref are the flags from the map
operation (GNTMAP_*), not status flags (GTF_*).
Signed-off-by: Daniel De Graaf <dgdegra@tycho.nsa.gov>
Committed-by: Keir Fraser <keir@xen.org>
Wei Wang [Tue, 22 Nov 2011 13:27:19 +0000 (13:27 +0000)]
amd iommu: Support INVALIDATE_IOMMU_ALL command.
It is one of the new architectural commands supported by iommu v2.
It instructs iommu to clear all address translation and interrupt
remapping caches for all devices and all domains.
Signed-off-by: Wei Wang <wei.wang2@amd.com>
Committed-by: Keir Fraser <keir@xen.org>
Wei Wang [Tue, 22 Nov 2011 13:26:46 +0000 (13:26 +0000)]
amd iommu: Factor out iommu command handling functions,
and move them into a new file.
Signed-off-by: Wei Wang <wei.wang2@amd.com>
Committed-by: Keir Fraser <keir@xen.org>
Wei Wang [Tue, 22 Nov 2011 13:26:11 +0000 (13:26 +0000)]
amd iommu: Fix incorrect definitions.
Signed-off-by: Wei Wang <wei.wang2@amd.com>
Committed-by: Keir Fraser <keir@xen.org>
Wei Wang [Tue, 22 Nov 2011 13:25:42 +0000 (13:25 +0000)]
amd iommu: Advertise iommu extended feature bits to xen.
Signed-off-by: Wei Wang <wei.wang2@amd.com>
Committed-by: Keir Fraser <keir@xen.org>
Keir Fraser [Tue, 22 Nov 2011 13:00:21 +0000 (13:00 +0000)]
x86,waitqueue: Allocate whole page for shadow stack.
Signed-off-by: Keir Fraser <keir@xen.org>
Keir Fraser [Tue, 22 Nov 2011 12:53:48 +0000 (12:53 +0000)]
x86,vmx: Remove broken and unused __vmptrst().
Signed-off-by: Keir Fraser <keir@xen.org>
Keir Fraser [Mon, 21 Nov 2011 21:28:34 +0000 (21:28 +0000)]
hvmloader: Fix memory relocation loop.
Signed-off-by: Keir Fraser <keir@xen.org>
Jan Beulich [Mon, 21 Nov 2011 08:29:31 +0000 (09:29 +0100)]
x86/vioapic: clear remote IRR when switching RTE to edge triggered mode
Xen itself (as much as Linux) relies on this behavior, so it should
also emulate it properly. Not doing so reportedly gets in the way of
kexec inside a HVM guest.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Tested-by: Olaf Hering <olaf@aepfle.de>
Keir Fraser [Sat, 19 Nov 2011 22:13:51 +0000 (22:13 +0000)]
x86: Fix RCU locking in XENMEM_add_to_physmap.
Signed-off-by: Keir Fraser <keir@xen.org>
Jean Guyader [Fri, 18 Nov 2011 13:43:26 +0000 (13:43 +0000)]
iommu: Introduce per cpu flag (iommu_dont_flush_iotlb) to avoid unnecessary iotlb flush
Add cpu flag that will be checked by the iommu low level code
to skip iotlb flushes. iommu_iotlb_flush shall be called explicitly.
Signed-off-by: Jean Guyader <jean.guyader@eu.citrix.com>
Committed-by: Keir Fraser <keir@xen.org>
Jean Guyader [Fri, 18 Nov 2011 13:42:46 +0000 (13:42 +0000)]
hvmloader: Change memory relocation loop when overlap with PCI hole
Change the way we relocate memory pages if they overlap with the PCI
hole. Use the new map space (XENMAPSPACE_gmfn_range) to move the loop
into Xen.
This code usually gets triggered when a device is passed through to a
guest and the PCI hole has to be extended to have enough room to map
the device BARs. The PCI hole will start lower and it might overlap
with some RAM that has been allocated for the guest. That usually
happens if the guest has more than 4G of RAM. We have to relocate
those pages in high mem otherwise they won't be accessible.
Signed-off-by: Jean Guyader <jean.guyader@eu.citrix.com>
Committed-by: Keir Fraser <keir@xen.org>
Jean Guyader [Fri, 18 Nov 2011 13:42:08 +0000 (13:42 +0000)]
mm: New XENMEM space, XENMAPSPACE_gmfn_range
XENMAPSPACE_gmfn_range is like XENMAPSPACE_gmfn but it runs on
a range of pages. The size of the range is defined in a new field.
This new field .size is located in the 16 bits padding between .domid
and .space in struct xen_add_to_physmap to stay compatible with older
versions.
Signed-off-by: Jean Guyader <jean.guyader@eu.citrix.com>
Committed-by: Keir Fraser <keir@xen.org>
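Why placing .size in the padding keeps the interface compatible can be illustrated with two stand-in structures. The struct names and the plain integer member types are assumptions for this sketch (the real definition lives in Xen's public headers), and it assumes an ABI where unsigned int is aligned to 4 bytes.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical "old" layout: the compiler leaves 16 bits of padding
 * after domid so that space is naturally aligned. */
struct old_add_to_physmap {
    uint16_t domid;
    unsigned int space;
    unsigned long idx;
    unsigned long gpfn;
};

/* Hypothetical "new" layout: size occupies the former padding. */
struct new_add_to_physmap {
    uint16_t domid;
    uint16_t size;
    unsigned int space;
    unsigned long idx;
    unsigned long gpfn;
};

/* The overall size and the offsets of the pre-existing fields are unchanged. */
_Static_assert(sizeof(struct old_add_to_physmap) ==
               sizeof(struct new_add_to_physmap), "size changed");
_Static_assert(offsetof(struct old_add_to_physmap, space) ==
               offsetof(struct new_add_to_physmap, space), "space moved");
_Static_assert(offsetof(struct old_add_to_physmap, idx) ==
               offsetof(struct new_add_to_physmap, idx), "idx moved");
```

Callers built against the old layout therefore pass a structure of the same size with the same field offsets; only code that selects the new map space needs to look at the extra field.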
Jean Guyader [Fri, 18 Nov 2011 13:41:33 +0000 (13:41 +0000)]
add_to_physmap: Move the code for XENMEM_add_to_physmap
Move the code for the XENMEM_add_to_physmap case into its own
function (xenmem_add_to_physmap).
Signed-off-by: Jean Guyader <jean.guyader@eu.citrix.com>
Committed-by: Keir Fraser <keir@xen.org>
Jean Guyader [Fri, 18 Nov 2011 13:40:56 +0000 (13:40 +0000)]
iommu: Introduce iommu_flush and iommu_flush_all.
Signed-off-by: Jean Guyader <jean.guyader@eu.citrix.com>
Committed-by: Keir Fraser <keir@xen.org>
Jean Guyader [Fri, 18 Nov 2011 13:40:19 +0000 (13:40 +0000)]
vtd: Refactor iotlb flush code
Factor out the iotlb flush code from map_page and unmap_page into
its own function.
Signed-off-by: Jean Guyader <jean.guyader@eu.citrix.com>
Committed-by: Keir Fraser <keir@xen.org>
Juergen Gross [Fri, 18 Nov 2011 13:34:43 +0000 (13:34 +0000)]
sched_sedf: Avoid panic when adjusting sedf parameters
When using sedf scheduler in a cpupool the system might panic when
setting sedf scheduling parameters for a domain. Introduces
for_each_domain_in_cpupool macro as it is usable 4 times now. Add
appropriate locking in cpupool_unassign_cpu().
Signed-off-by: Juergen Gross <juergen.gross@ts.fujitsu.com>
Committed-by: Keir Fraser <keir@xen.org>
Paul Durrant [Fri, 18 Nov 2011 13:32:50 +0000 (13:32 +0000)]
hvmloader: Add configuration options to selectively disable S3 and S4 ACPI power states.
Introduce acpi_s3 and acpi_s4 configuration options (default=1). The
S3 and S4 packages are moved into separate SSDTs and their inclusion
is controlled by the new configuration options.
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Committed-by: Keir Fraser <keir@xen.org>
Paul Durrant [Fri, 18 Nov 2011 13:31:43 +0000 (13:31 +0000)]
hvmloader: Move acpi_enabled out of hvm_info_table into xenstore
Since hvmloader has a xenstore client, use a platform key in xenstore
to indicate whether ACPI is enabled or not rather than the shared
hvm_info_table structure.
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Committed-by: Keir Fraser <keir@xen.org>
Jan Beulich [Fri, 18 Nov 2011 08:22:45 +0000 (09:22 +0100)]
x86/xsave: provide guests with finit-like environment
Without the use of xsave, guests get their initial floating point
environment set up with finit. At least NetWare actually depends on
this (in particular on all exceptions being masked), so to be
consistent set the same environment also when using xsave. This is
also in line with all SSE exceptions getting masked initially.
To avoid further fragile casts in xstate_alloc_save_area() the patch
also changes xsave_struct's fpu_sse member to have actually usable
fields.
The patch was tested in its technically identical, but modified-file-
wise different 4.1.2 version.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Tested-by: Charles Arnold <carnold@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
Jan Beulich [Fri, 18 Nov 2011 08:21:24 +0000 (09:21 +0100)]
x86/IRQ: prevent vector sharing within IO-APICs
Following the prevention of vector sharing for MSIs, this change
enforces the same within IO-APICs: Pin based interrupts use the IO-APIC
as their identifying device under the AMD IOMMU (and just like for
MSIs, only the identifying device is used to remap interrupts here,
with no regard to an interrupt's destination).
Additionally, LAPIC initiated EOIs (for level triggered interrupts) too
use only the vector for identifying which interrupts to end. While this
generally causes no significant problem (at worst an interrupt would be
re-raised without a new interrupt event actually having occurred), it
still seems better to avoid the situation.
For this second aspect, a distinction is being made between the
traditional and the directed-EOI cases: In the former, vectors should
not be shared throughout all IO-APICs in the system, while in the
latter case only individual IO-APICs need to be constrained (or, if the
firmware indicates so, sub-groups of them having the same GSI appear
at multiple pins).
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Jan Beulich [Fri, 18 Nov 2011 08:18:41 +0000 (09:18 +0100)]
x86/IO-APIC: refine EOI-ing of migrating level interrupts
Rather than going through all IO-APICs and calling io_apic_eoi_vector()
for the vector in question, just use eoi_IO_APIC_irq().
This in turn allows eliminating quite a bit of other code.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Tested-by: Andrew Cooper <andrew.cooper3@citrix.com>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
Keir Fraser [Wed, 16 Nov 2011 18:21:14 +0000 (18:21 +0000)]
elf: Fix Elf64 types and structs to match the specification.
The layouts were actually correct, but the type names were a bit
messed up.
Original patch by Volker Eckert <volker.eckert@citrix.com>
Signed-off-by: Keir Fraser <keir@xen.org>
Jan Beulich [Wed, 16 Nov 2011 16:04:31 +0000 (16:04 +0000)]
x86/emulator: add feature checks for newer instructions
Certain instructions were introduced only after the i686 or original
x86-64 architecture, so we should not try to emulate them if the guest
is not seeing the respective feature enabled (or, worse, if the
underlying hardware doesn't support them). This affects fisttp,
movnti, and cmpxchg16b.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Keir Fraser <keir@xen.org>
Jan Beulich [Wed, 16 Nov 2011 15:50:55 +0000 (15:50 +0000)]
test_x86_emulator: add a "run" target to the test code makefile
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Committed-by: Keir Fraser <keir@xen.org>
Keir Fraser [Wed, 16 Nov 2011 15:48:49 +0000 (15:48 +0000)]
x86_emulate: Define and use BUG() and bool_t.
Original patch by Jan Beulich <jbeulich@suse.com>
Signed-off-by: Keir Fraser <keir@xen.org>
Keir Fraser [Wed, 16 Nov 2011 15:28:55 +0000 (15:28 +0000)]
test_x86_emulate: Get public Xen headers via tools/include.
Signed-off-by: Keir Fraser <keir@xen.org>
Jan Beulich [Wed, 16 Nov 2011 15:20:25 +0000 (15:20 +0000)]
x86-64/test_x86_emulate: fix blowfish test
Incorrect register usage in the _start() wrapper caused the 64-bit
execution emulation to fail.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Committed-by: Keir Fraser <keir@xen.org>
Gianluca Guida [Wed, 16 Nov 2011 15:19:33 +0000 (15:19 +0000)]
[shadow] Disable higher level pagetables early unshadow only when the "process dying" hypercall is used.
This patch fixes a performance problem in fully virtualized guests.
Signed-off-by: Gianluca Guida <gianluca.guida@citrix.com>
Tested-by: Jan Beulich <jbeulich@suse.com>
Committed-by: Keir Fraser <keir@xen.org>
Stefano Stabellini [Wed, 16 Nov 2011 15:17:37 +0000 (15:17 +0000)]
hvm: introduce HVM_PARAM_BUFIOREQ_EVTCHN
Introduce an event channel for buffered io event notifications,
advertise the port number using an hvm param. This way the device
model is not forced to check the buffered io page for data several
times a second for the entire life of the VM (buffered io is mostly
used for stdvga emulation in Xen that is switched off after the guest
goes into graphical mode).
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Committed-by: Keir Fraser <keir@xen.org>
Andrew Cooper [Tue, 15 Nov 2011 13:50:18 +0000 (14:50 +0100)]
KEXEC cleanup: IA64 specific functions should not live in generic header files
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
Committed-by: Jan Beulich <jbeulich@suse.com>
Jan Beulich [Tue, 15 Nov 2011 13:47:41 +0000 (14:47 +0100)]
ia64: fix the build
This addresses all remaining build problems introduced over the last
several months.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Ian Campbell [Tue, 15 Nov 2011 13:24:38 +0000 (14:24 +0100)]
Keir Fraser [Mon, 14 Nov 2011 20:15:35 +0000 (20:15 +0000)]
hvmloader: Move acpi_info structure out from low memory.
This avoids a conflict with SeaBIOS's memory management. Moreover
there is no reason that acpi_info must live below 1MB, and moving it
out actually simplifies our code.
Signed-off-by: Keir Fraser <keir@xen.org>
Roger Pau Monne [Mon, 14 Nov 2011 18:17:44 +0000 (18:17 +0000)]
tools/check: check for headers and libraries in user defined folders.
Parse EXTRA_INCLUDES, EXTRA_LIB, PREPEND_INCLUDES, PREPEND_LIB,
APPEND_INCLUDES, APPEND_LIB during checks, to search for required
files.
Signed-off-by: Roger Pau Monne <roger.pau@entel.upc.edu>
Committed-by: Ian Jackson <ian.jackson@eu.citrix.com>
Roger Pau Monne [Mon, 14 Nov 2011 18:14:07 +0000 (18:14 +0000)]
tools/build: Introduce {PREPEND,APPEND}_{LIB,INCLUDES}
Create two new variables called APPEND_ and PREPEND_ to add compile
flags at the beginning or at the end of the search path.
Added a new semantic for user defined compile flags, here is the list
of possible options:
PREPEND_LIB: add libraries to the search path before xen
(before xen installation folders).
PREPEND_INCLUDES: add headers to the search path before xen
(before xen installation folders).
APPEND_LIB: add libraries to the search path at the end
(after all xen installation folders have been added).
APPEND_INCLUDES: add headers to the search path at the end
(after all xen installation folders have been added).
EXTRA_INCLUDES and EXTRA_LIB can still be used, and they will have the
same effect as PREPEND_INCLUDES and PREPEND_LIB.
Signed-off-by: Roger Pau Monne <roger.pau@entel.upc.edu>
Committed-by: Ian Jackson <ian.jackson@eu.citrix.com>
Konrad Rzeszutek Wilk [Mon, 14 Nov 2011 17:54:54 +0000 (17:54 +0000)]
tools: xend: tolerate empty state/*.xml
Bugzilla 1680: Xend fails to start if /var/lib/xend/state/*.xml are empty
which I get often when replacing the Xen hypervisor with a newer version.
This can easily be reproduced under Fedora Core 16 by installing
xen RPMs and then replacing the xen.gz with a newer version.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Anthony Low <shinji@pikopiko.org>
Committed-by: Ian Jackson <ian.jackson@eu.citrix.com>
Ian Campbell [Mon, 14 Nov 2011 17:50:53 +0000 (17:50 +0000)]
docs: report if we do not build a doc due to lack of the necessary tool
Previously only some targets did this. An alternative would be to make a hard
dependency on these tools, this might make more sense especially for markdown?
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Committed-by: Ian Jackson <ian.jackson@eu.citrix.com>
Olaf Hering [Mon, 14 Nov 2011 17:49:14 +0000 (17:49 +0000)]
xenpaging: munmap all pages after page-in
Do munmap() on all mapped pages, not just the first one. Without this
change the gfns backing the remaining pages can not be paged out again
because the page count does not go down to 1. This change was missing
from changeset 23827:d1d6abc1db20.
Signed-off-by: Olaf Hering <olaf@aepfle.de>
Committed-by: Ian Jackson <ian.jackson@eu.citrix.com>
Andrew Cooper [Fri, 11 Nov 2011 18:14:35 +0000 (18:14 +0000)]
Revert c/s 23666:b96f8bdcaa15 KEXEC: disconnect all PCI devices from the PCI bus on crash
It turns out that this causes all manner of problems on certain
motherboards (so far with no pattern I can discern)
Problems include:
* Hanging forever checking hlt instruction.
* Panics when trying to change switch root device
* Drivers hanging when trying to check for interrupts.
From: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Keir Fraser <keir@xen.org>
Committed-by: Keir Fraser <keir@xen.org>
Andres Lagar-Cavilla [Fri, 11 Nov 2011 18:11:34 +0000 (18:11 +0000)]
Modify naming of queries into the p2m
Callers of lookups into the p2m code are now variants of get_gfn. All
callers need to call put_gfn. The code behind it is a no-op at the
moment, but will change to proper locking in a later patch.
This patch does not change functionality. Only naming, and adds
put_gfn's.
set_p2m_entry retains its name because it is always called with
p2m_lock held.
This patch is humongous, unfortunately, given the dozens of call sites
involved.
After this patch, anyone using old style gfn_to_mfn will not succeed
in compiling their code. This is on purpose: adapt to the new API.
Signed-off-by: Andres Lagar-Cavilla <andres@lagarcavilla.org>
Acked-by: Tim Deegan <tim@xen.org>
Committed-by: Keir Fraser <keir@xen.org>
Jan Beulich [Fri, 11 Nov 2011 16:46:19 +0000 (17:46 +0100)]
ia64: introduce atomic_{read,write}NN()
These are required to be able to build certain portions of common code.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
Lasse Collin [Fri, 11 Nov 2011 13:35:51 +0000 (14:35 +0100)]
Decompressors: check input size in unlzo.c
From: Lasse Collin <lasse.collin@tukaani.org>
The code assumes that the input is valid and not truncated. Add checks to
avoid reading past the end of the input buffer. Change the type of "skip"
from u8 to int to fix a possible integer overflow.
Signed-off-by: Lasse Collin <lasse.collin@tukaani.org>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
Committed-by: Jan Beulich <jbeulich@suse.com>
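The two fixes described above (a wider type for "skip" and a check against the input size) can be illustrated with a small sketch. FIXED_HEADER_LEN and both function names are hypothetical, chosen only to show how an 8-bit length wraps; they are not taken from unlzo.c.

```c
#include <stdint.h>

#define FIXED_HEADER_LEN 34   /* hypothetical size of the fixed header part */

/* Narrow type: the sum wraps modulo 256 when the name is long. */
uint8_t header_len_u8(uint8_t name_len)
{
    return (uint8_t)(FIXED_HEADER_LEN + name_len);
}

/* Wide type plus a bounds check against the available input.
 * Returns the header length, or -1 if the input is too short. */
int header_len_checked(uint8_t name_len, long in_len)
{
    int len = FIXED_HEADER_LEN + name_len;

    if (in_len < len)
        return -1;              /* truncated input: refuse to read past the end */
    return len;
}
```

With a 250-byte name the 8-bit variant reports a length of 28 instead of 284, so a parser relying on it would resume reading at the wrong offset.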
Lasse Collin [Fri, 11 Nov 2011 13:35:05 +0000 (14:35 +0100)]
Decompressors: check for write errors in unlzo.c
From: Lasse Collin <lasse.collin@tukaani.org>
The return value of flush() is not checked in unlzo(). This means that
the decompressor won't stop even if the caller doesn't want more data.
This can happen e.g. with a corrupt LZO-compressed initramfs image.
Signed-off-by: Lasse Collin <lasse.collin@tukaani.org>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
Committed-by: Jan Beulich <jbeulich@suse.com>
Lasse Collin [Fri, 11 Nov 2011 13:34:24 +0000 (14:34 +0100)]
Decompressors: validate match distance in unlzma.c
From: Lasse Collin <lasse.collin@tukaani.org>
Validate the newly decoded distance (rep0) in process_bit1(). This is to
detect corrupt LZMA data quickly. The old code can run for a long time
producing garbage until it hits the end of the input.
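A rough sketch of this kind of distance check, under simplifying assumptions: toy_copy_match is a hypothetical helper, and dist is a plain 1-based distance into a flat output buffer, whereas the real decoder stores rep0 differently and writes into a dictionary window.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Copy len bytes starting dist bytes back in the output produced so far.
 * Returns 0 on success, -1 if the distance or length is invalid.
 */
static int toy_copy_match(unsigned char *out, size_t out_cap, size_t *out_pos,
                          size_t dist, size_t len)
{
    size_t i;

    assert(*out_pos <= out_cap);

    /* A match may only refer to bytes that were already produced. */
    if (dist == 0 || dist > *out_pos)
        return -1;

    /* The copy must also fit in the remaining output space. */
    if (len > out_cap - *out_pos)
        return -1;

    /* Byte-by-byte copy so that overlapping matches repeat correctly. */
    for (i = 0; i < len; i++) {
        out[*out_pos] = out[*out_pos - dist];
        (*out_pos)++;
    }
    return 0;
}
```

Rejecting the distance as soon as it is decoded stops a corrupt stream at the first impossible match instead of letting it copy garbage until the input runs out.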
Signed-off-by: Lasse Collin <lasse.collin@tukaani.org>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
Committed-by: Jan Beulich <jbeulich@suse.com>
Lasse Collin [Fri, 11 Nov 2011 13:33:30 +0000 (14:33 +0100)]
Decompressors: check for write errors in unlzma.c
From: Lasse Collin <lasse.collin@tukaani.org>
The return value of wr->flush() is not checked in write_byte(). This
means that the decompressor won't stop even if the caller doesn't want
more data. This can happen e.g. with corrupt LZMA-compressed initramfs.
Returning the error quickly allows the user to see the error message
sooner.
There is a similar missing check for wr.flush() near the end of unlzma().
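The pattern can be sketched as follows. The writer structure, its four-byte buffer and the flush signature are assumptions made for illustration (toy_writer, toy_write_byte, toy_write_all and the two sinks are hypothetical); the real unlzma.c types differ.

```c
#include <assert.h>
#include <stddef.h>

struct toy_writer {
    unsigned char buf[4];
    size_t used;
    /* Returns the number of bytes accepted; anything else means "stop". */
    size_t (*flush)(const unsigned char *data, size_t len);
};

/* Append one byte and flush when the buffer is full. -1 on flush failure. */
static int toy_write_byte(struct toy_writer *wr, unsigned char byte)
{
    assert(wr->used < sizeof(wr->buf));
    wr->buf[wr->used++] = byte;

    if (wr->used == sizeof(wr->buf)) {
        size_t n = wr->used;

        wr->used = 0;
        if (wr->flush(wr->buf, n) != n)
            return -1;   /* the caller does not want more data */
    }
    return 0;
}

/* Decoder-style loop: stop as soon as a write reports an error. */
static int toy_write_all(struct toy_writer *wr,
                         const unsigned char *src, size_t len)
{
    size_t i;

    for (i = 0; i < len; i++)
        if (toy_write_byte(wr, src[i]) != 0)
            return -1;
    return 0;
}

/* Example sinks: one accepts everything, one refuses everything. */
static size_t toy_sink_ok(const unsigned char *d, size_t len)
{
    (void)d;
    return len;
}

static size_t toy_sink_full(const unsigned char *d, size_t len)
{
    (void)d;
    (void)len;
    return 0;
}
```

With toy_sink_full installed, toy_write_all gives up after the first full buffer rather than decoding the remaining input for nothing.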
Signed-off-by: Lasse Collin <lasse.collin@tukaani.org>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
Committed-by: Jan Beulich <jbeulich@suse.com>
Lasse Collin [Fri, 11 Nov 2011 13:32:57 +0000 (14:32 +0100)]
Decompressors: check for read errors in unlzma.c
From: Lasse Collin <lasse.collin@tukaani.org>
Return value of rc->fill() is checked in rc_read() and error() is called
when needed, but then the code continues as if nothing had happened.
rc_read() is a void function and it's on the top of performance critical
call stacks, so propagating the error code via return values doesn't sound
like the best fix. It seems better to check rc->buffer_size (which holds
the return value of rc->fill()) in the main loop. It does no harm
that the code runs briefly on unknown data after a failed rc->fill().
This fixes an infinite loop in initramfs decompression if the
LZMA-compressed initramfs image is corrupt.
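A simplified sketch of the approach: the refill helper stays void, and the main loop inspects the stored fill result. All names here (toy_rc, toy_fill, toy_rc_read, toy_rc_next, toy_decode) are hypothetical, and the "decoding" is just a byte sum standing in for the real range decoder.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct toy_rc {
    const unsigned char *src;   /* backing input */
    size_t src_left;
    unsigned char buf[4];
    long buffer_size;           /* last fill result; <= 0 means failure */
    long pos;
};

/* Refill the buffer; returns bytes read, or -1 when no input is left. */
static long toy_fill(struct toy_rc *rc)
{
    size_t n = rc->src_left < sizeof(rc->buf) ? rc->src_left : sizeof(rc->buf);

    if (n == 0)
        return -1;
    memcpy(rc->buf, rc->src, n);
    rc->src += n;
    rc->src_left -= n;
    return (long)n;
}

/* Hot-path helper: stays void and only records the fill result. */
static void toy_rc_read(struct toy_rc *rc)
{
    rc->buffer_size = toy_fill(rc);
    rc->pos = 0;
}

/* Fetch the next byte; yields a dummy 0 after a failed refill. */
static unsigned char toy_rc_next(struct toy_rc *rc)
{
    if (rc->pos >= rc->buffer_size)
        toy_rc_read(rc);
    if (rc->buffer_size <= 0)
        return 0;
    return rc->buf[rc->pos++];
}

/* Main loop: the error is detected here, once per iteration. */
static int toy_decode(struct toy_rc *rc, unsigned long want, unsigned long *sum)
{
    unsigned long i;

    assert(sum != NULL);
    for (i = 0; i < want; i++) {
        *sum += toy_rc_next(rc);
        if (rc->buffer_size <= 0)
            return -1;   /* failed fill: leave the loop instead of spinning */
    }
    return 0;
}
```

The loop may consume one dummy byte after a failed refill, which mirrors the commit's remark that running briefly on unknown data is harmless as long as the loop then terminates.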
Signed-off-by: Lasse Collin <lasse.collin@tukaani.org>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
Committed-by: Jan Beulich <jbeulich@suse.com>
Lasse Collin [Fri, 11 Nov 2011 13:32:03 +0000 (14:32 +0100)]
Decompressors: fix header validation in unlzma.c
From: Lasse Collin <lasse.collin@tukaani.org>
Validation of header.pos calls error() but doesn't make the function
return to indicate an error to the caller. Instead the decoding is
attempted with invalid header.pos. This fixes it.
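As an illustration of "report and return", here is a sketch of validating an LZMA properties byte, which encodes lc, lp and pb as (pb * 5 + lp) * 9 + lc and must therefore be below 9 * 5 * 5. The function name toy_parse_props is hypothetical, and the mapping of this byte to header.pos is an assumption based on the description above.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Split an LZMA properties byte into lc, lp and pb.
 * Returns 0 on success, -1 if the byte is out of range.
 */
static int toy_parse_props(unsigned char props, int *lc, int *lp, int *pb)
{
    assert(lc != NULL && lp != NULL && pb != NULL);

    /* Reject the value and return; do not fall through and decode. */
    if (props >= 9 * 5 * 5)
        return -1;

    *lc = props % 9;
    props /= 9;
    *lp = props % 5;
    *pb = props / 5;
    return 0;
}
```

For the common properties byte 0x5D this yields lc = 3, lp = 0 and pb = 2.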
Signed-off-by: Lasse Collin <lasse.collin@tukaani.org>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
Committed-by: Jan Beulich <jbeulich@suse.com>
Lasse Collin [Fri, 11 Nov 2011 13:31:38 +0000 (14:31 +0100)]
Decompressors: remove unused function from unlzma.c
From: Lasse Collin <lasse.collin@tukaani.org>
Signed-off-by: Lasse Collin <lasse.collin@tukaani.org>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
Committed-by: Jan Beulich <jbeulich@suse.com>
Phillip Lougher [Fri, 11 Nov 2011 13:30:36 +0000 (14:30 +0100)]
bzip2: Add missing checks for malloc returning NULL
From: Phillip Lougher <phillip@lougher.demon.co.uk>
Signed-off-by: Phillip Lougher <phillip@lougher.demon.co.uk>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
Committed-by: Jan Beulich <jbeulich@suse.com>
Lasse Collin [Fri, 11 Nov 2011 13:29:21 +0000 (14:29 +0100)]
Decompressors: get rid of set_error_fn() macro
From: Lasse Collin <lasse.collin@tukaani.org>
set_error_fn() is a useless complication. Only unlzma.c had some use
for it and that was easy to change too.
This also gets rid of the static function pointer "error".
Signed-off-by: Lasse Collin <lasse.collin@tukaani.org>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
Committed-by: Jan Beulich <jbeulich@suse.com>
Jan Beulich [Fri, 11 Nov 2011 13:27:41 +0000 (14:27 +0100)]
multicall: don't ignore failure from __copy_to_guest() upon preemption
At the same time, adjust perf counter updates to also count calls from
here even if a guest memory access failed.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
Jan Beulich [Fri, 11 Nov 2011 13:26:48 +0000 (14:26 +0100)]
x86/amd-ucode: further turn down verbosity
Turn up the log level on various (mostly debug-only) messages.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
Jan Beulich [Fri, 11 Nov 2011 13:25:16 +0000 (14:25 +0100)]
x86: quiesce cpuidle code
So far these messages got pointlessly (as the code in other places
assumes symmetric configuration) emitted once per CPU. Hide the debug
one behind opt_cpu_info, and issue the info one just once (if the code
gets adjusted to support asymmetric configurations, this would need to
be revisited, but ideally without producing per-CPU messages again).
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
Wei Wang [Fri, 11 Nov 2011 11:06:01 +0000 (12:06 +0100)]
amd iommu: Introduce iommu_has_cap() function
Signed-off-by: Wei Wang <wei.wang2@amd.com>
Committed-by: Jan Beulich <jbeulich@suse.com>
Wei Wang [Fri, 11 Nov 2011 11:05:14 +0000 (12:05 +0100)]
amd iommu: Compress hyper-transport flags into a single byte
These flags are single bit, no need to be saved as integers.
Add 3 inline helpers to make single bit access easier.
Introduce iommu_has_ht_flag and set_iommu_ht_flags
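A sketch of packing single-bit flags into one byte with three small inline helpers. The structure, flag names, bit positions and helper names below are hypothetical and are not the ones this patch introduces; the real iommu_has_ht_flag and set_iommu_ht_flags may look quite different.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical bit positions inside the packed flag byte. */
#define TOY_FLAG_HT_TUN_EN 0
#define TOY_FLAG_PASS_PW   1
#define TOY_FLAG_ISOC      2

struct toy_iommu {
    uint8_t ht_flags;   /* one bit per flag instead of one int per flag */
};

/* Set a single flag bit. */
static inline void toy_set_bit8(uint8_t *flags, unsigned int bit)
{
    assert(bit < 8);
    *flags |= (uint8_t)(1u << bit);
}

/* Clear a single flag bit. */
static inline void toy_clear_bit8(uint8_t *flags, unsigned int bit)
{
    assert(bit < 8);
    *flags &= (uint8_t)~(1u << bit);
}

/* Test a single flag bit; returns 0 or 1. */
static inline int toy_test_bit8(uint8_t flags, unsigned int bit)
{
    assert(bit < 8);
    return (flags >> bit) & 1u;
}
```

Keeping the flags in one byte also means the whole set can be copied or compared in a single operation.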
Signed-off-by: Wei Wang <wei.wang2@amd.com>
Committed-by: Jan Beulich <jbeulich@suse.com>
Wei Wang [Fri, 11 Nov 2011 11:04:41 +0000 (12:04 +0100)]
amd iommu: Disable debug output for early DTE update
Some systems may have IVHD device entries that cover a large device ID
range. Printing every such entry makes booting take a very long time.
Signed-off-by: Wei Wang <wei.wang2@amd.com>
Committed-by: Jan Beulich <jbeulich@suse.com>
Wei Wang [Fri, 11 Nov 2011 11:04:10 +0000 (12:04 +0100)]
amd iommu: Simplify IVHD device flag handling
These bits are aligned to corresponding fields in device table entry. They
can be updated by a single device entry write.
Signed-off-by: Wei Wang <wei.wang2@amd.com>
Committed-by: Jan Beulich <jbeulich@suse.com>
Wei Wang [Fri, 11 Nov 2011 11:03:21 +0000 (12:03 +0100)]
amd iommu: Cleanup iommu pci capabilities detection
* Define new structure to represent capability block.
* Remove unnecessary read for unused information.
* Add sanity check into get_iommu_capabilities.
* iommu capability offset is 16 bit not 8 bit, fix that.
Signed-off-by: Wei Wang <wei.wang2@amd.com>
Committed-by: Jan Beulich <jbeulich@suse.com>
Wei Wang [Fri, 11 Nov 2011 11:01:55 +0000 (12:01 +0100)]
amd iommu: Use pci access function to detect msi capabilities
Signed-off-by: Wei Wang <wei.wang2@amd.com>
Committed-by: Jan Beulich <jbeulich@suse.com>
Jean Guyader [Fri, 11 Nov 2011 09:14:22 +0000 (10:14 +0100)]
Hypercall continuation cancelation in compat mode for XENMEM_get/set_pod_target
If copy_to_guest failed in the compat code after a continuation has been
done in the native code we need to cancel it so we won't reexecute the
hypercall but return from the hypercall with the appropriate error.
Signed-off-by: Jean Guyader <jean.guyader@eu.citrix.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
Committed-by: Jan Beulich <jbeulich@suse.com>
Jan Beulich [Fri, 11 Nov 2011 08:47:40 +0000 (09:47 +0100)]
x86/IRQ: eliminate irq_vector[]
The vector is already being tracked in struct irq_desc's arch.vector
member, so there's no real need for a second place for this to get
stored. The only caveat is that legacy vectors (used for interrupts
handled through the 8259) must be special cased to not prevent non-
legacy vectors from being assigned.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Acked-by: Keir Fraser <keir@xen.org>
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>